Smart Paper Notes

Concentration inequalities for correlated network-valued processes with applications to community estimation and changepoint analysis

Sayak Chatterjee , Shirshendu Chatterjee , Soumendu Sundar Mukherjee , Anirban Nath , Sharmodeep Bhattacharyya

Categories: Machine Learning (Statistics)

2022-08-02

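Network-valued time series are currently a common form of network data. However, the aggregate behavior of network sequences generated by network-valued stochastic processes has been studied relatively little. Most existing work focuses on the simple setting in which the networks are independent (or conditionally independent) across time and all edges are updated synchronously at each time step. In this paper, we study the concentration properties of the aggregated adjacency matrix and the corresponding Laplacian matrix associated with network sequences generated by lazy network-valued stochastic processes, in which edges update asynchronously and each edge follows its own lazy stochastic process, updating independently of the other edges. We demonstrate the usefulness of these concentration results by proving the consistency of standard estimators in community estimation and changepoint estimation problems. We also conduct a simulation study to demonstrate the effect of the laziness parameter, which controls the extent of temporal correlation, on the accuracy of community and changepoint estimation.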
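As a rough illustration of the kind of process this abstract describes, the sketch below simulates a single lazy edge and measures its temporal correlation. The update rule (keep the previous state with probability `laziness`, otherwise redraw from Bernoulli(`p`)) and all function names are assumptions made for this example, not the authors' exact model.

```python
import random


def simulate_lazy_edge(p, laziness, steps, rng):
    """Simulate one edge indicator over time.

    Assumed rule: at each step the edge keeps its previous state with
    probability `laziness`; otherwise it is redrawn as Bernoulli(p).
    """
    state = 1 if rng.random() < p else 0
    path = [state]
    for _ in range(steps - 1):
        if rng.random() >= laziness:  # the edge refreshes at this step
            state = 1 if rng.random() < p else 0
        path.append(state)
    return path


def lag1_autocorrelation(path):
    """Empirical lag-1 autocorrelation of a 0/1 sequence."""
    n = len(path)
    mean = sum(path) / n
    var = sum((x - mean) ** 2 for x in path) / n
    if var == 0:
        return 0.0
    cov = sum((path[t] - mean) * (path[t + 1] - mean) for t in range(n - 1)) / (n - 1)
    return cov / var


rng = random.Random(0)
for laziness in (0.0, 0.5, 0.9):
    path = simulate_lazy_edge(p=0.3, laziness=laziness, steps=20000, rng=rng)
    print(f"laziness={laziness:.1f}  lag-1 autocorrelation={lag1_autocorrelation(path):.3f}")
```

Under this assumed rule the lag-1 autocorrelation of an edge equals the laziness value in expectation, which is one simple way to see how a single parameter can control the degree of temporal dependence.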

Translated by Google Translate

An Adaptive Kernel Approach to Federated Learning of Heterogeneous Causal Effects

Thanh Vinh Vo , Arnab Bhattacharyya , Young Lee , Tze-Yun Leong

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2023-01-01

We propose a new causal inference framework to learn causal effects from multiple, decentralized data sources in a federated setting. We introduce an adaptive transfer algorithm that learns the similarities among the data sources by utilizing Random Fourier Features to disentangle the loss function into multiple components, each of which is associated with a data source. The data sources may have different distributions; the causal effects are independently and systematically incorporated. The proposed method estimates the similarities among the sources through transfer coefficients, and hence requires no prior information about the similarity measures. The heterogeneous causal effects can be estimated with no sharing of the raw training data among the sources, thus minimizing the risk of privacy leaks. We also provide minimax lower bounds to assess the quality of the parameters learned from the disparate sources. The proposed method is empirically shown to outperform the baselines on decentralized data sources with dissimilar distributions.
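For readers unfamiliar with Random Fourier Features, the sketch below shows only the generic technique: an explicit random feature map whose inner product approximates a Gaussian (RBF) kernel. It does not reproduce the paper's federated algorithm or its loss decomposition; the function names, the kernel choice, and the parameter values are assumptions for illustration.

```python
import math
import random


def make_rff(dim, num_features, lengthscale, rng):
    """Build a random feature map z(x) such that z(x) . z(y) approximates
    exp(-||x - y||^2 / (2 * lengthscale^2))."""
    # Frequencies are drawn from the Gaussian spectral density of the RBF kernel.
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return features


def rbf_kernel(x, y, lengthscale):
    """Exact Gaussian kernel, used here only as a reference value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


rng = random.Random(0)
features = make_rff(dim=3, num_features=4000, lengthscale=1.0, rng=rng)
x, y = [0.1, -0.4, 0.7], [0.3, 0.2, -0.5]
approx = sum(a * b for a, b in zip(features(x), features(y)))
print(f"exact kernel: {rbf_kernel(x, y, 1.0):.4f}  RFF approximation: {approx:.4f}")
```

Because the feature map is explicit and finite-dimensional, a kernel-based objective can be written as a sum of terms computed separately on each data source, which is plausibly what makes this kind of feature map convenient in a federated setting.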

Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data

Rupam Bhattacharyya , Nicholas Henderson , Veerabhadran Baladandayuthapani

Categories: Machine Learning (Statistics)

2022-12-29

Rapid advancements in collection and dissemination of multi-platform molecular and genomics data have resulted in enormous opportunities to aggregate such data in order to understand, prevent, and treat human diseases. While significant improvements have been made in multi-omic data integration methods to discover biological markers and mechanisms underlying both prognosis and treatment, the precise cellular functions governing these complex mechanisms still need detailed and data-driven de-novo evaluations. We propose a framework called Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data (fiBAG), that allows simultaneous identification of upstream functional evidence of proteogenomic biomarkers and the incorporation of such knowledge in Bayesian variable selection models to improve signal detection. fiBAG employs a conflation of Gaussian process models to quantify (possibly non-linear) functional evidence via Bayes factors, which are then mapped to a novel calibrated spike-and-slab prior, thus guiding selection and providing functional relevance to the associations with patient outcomes. Using simulations, we illustrate how integrative methods with functional calibration have higher power to detect disease-related markers than non-integrative approaches. We demonstrate the profitability of fiBAG via a pan-cancer analysis of 14 cancer types to identify and assess the cellular mechanisms of proteogenomic markers associated with cancer stemness and patient survival.

BNSynth: Bounded Boolean Functional Synthesis

Ravi Raja , Stanly Samuel , Chiranjib Bhattacharyya , Deepak D'Souza , Aditya Kanade

Categories: Artificial Intelligence | Machine Learning

2022-12-15

The automated synthesis of correct-by-construction Boolean functions from logical specifications is known as the Boolean Functional Synthesis (BFS) problem. BFS has many application areas that range from software engineering to circuit design. In this paper, we introduce a tool, BNSynth, that is the first to solve the BFS problem under a given bound on the solution space. Bounding the solution space induces the synthesis of smaller functions that benefit resource-constrained areas such as circuit design. BNSynth uses a counter-example guided, neural approach to solve the bounded BFS problem. Initial results show promise in synthesizing smaller solutions; we observe at least 3.2X (and up to 24X) improvement in the reduction of solution size on average, as compared to state-of-the-art tools on our benchmarks. BNSynth is available on GitHub under an open source license.

Objective Surgical Skills Assessment and Tool Localization: Results from the MICCAI 2021 SimSurgSkill Challenge

Aneeq Zia , Kiran Bhattacharyya , Xi Liu , Ziheng Wang , Max Berniker , Satoshi Kondo , Emanuele Colleoni , Dimitris Psychogyios , Yueming Jin , Jinfan Zhou

Categories: Computer Vision

2022-12-08

Timely and effective feedback within surgical training plays a critical role in developing the skills required to perform safe and efficient surgery. Feedback from expert surgeons, while especially valuable in this regard, is challenging to acquire due to their typically busy schedules, and may be subject to biases. Formal assessment procedures like OSATS and GEARS attempt to provide objective measures of skill, but remain time-consuming. With advances in machine learning there is an opportunity for fast and objective automated feedback on technical skills. The SimSurgSkill 2021 challenge (hosted as a sub-challenge of EndoVis at MICCAI 2021) aimed to promote and foster work in this endeavor. Using virtual reality (VR) surgical tasks, competitors were tasked with localizing instruments and predicting surgical skill. Here we summarize the winning approaches and how they performed. Using this publicly available dataset and results as a springboard, future work may enable more efficient training of surgeons with advances in surgical data science. The dataset can be accessed from https://console.cloud.google.com/storage/browser/isi-simsurgskill-2021.

Archangel: A Hybrid UAV-based Human Detection Benchmark with Position and Pose Metadata

Yi-Ting Shen , Yaesop Lee , Heesung Kwon , Damon M. Conover , Shuvra S. Bhattacharyya , Nikolas Vale , Joshua D. Gray , G. Jeremy Leong , Kenneth Evensen , Frank Skirlo

Categories: Computer Vision

2022-08-31

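Learning to detect objects, such as humans, in imagery captured by an unmanned aerial vehicle (UAV) usually suffers from large variations caused by the UAV's position relative to the objects. In addition, existing UAV-based benchmark datasets do not provide adequate dataset metadata, which is essential for precise model diagnosis and for learning features that are invariant to those variations. In this paper, we introduce Archangel, the first UAV-based object detection dataset composed of real and synthetic subsets captured under similar imaging conditions and accompanied by UAV position and object pose metadata. A series of experiments is carefully designed with a state-of-the-art object detector to demonstrate the benefits of leveraging the metadata during model evaluation. Moreover, several crucial insights involving both real and synthetic data during model fine-tuning are presented. Finally, we discuss the advantages, limitations, and future directions of Archangel to highlight its distinct value for the broader machine learning community.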

Verification and search algorithms for causal DAGs

Davin Choo , Kirankumar Shiragur , Arnab Bhattacharyya

Categories: Machine Learning | Machine Learning (Statistics)

2022-06-30

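We study two problems related to recovering causal graphs from interventional data: (i) verification, where the task is to check whether a purported causal graph is correct, and (ii) search, where the task is to recover the correct causal graph. For both, we wish to minimize the number of interventions performed. For the first problem, we give a characterization of a minimal-size set of atomic interventions that is necessary and sufficient to check the correctness of a claimed causal graph. Our characterization uses the notion of covered edges, which enables us to obtain simple proofs and to reason easily about earlier results. We also generalize our results to the settings of bounded-size interventions and node-dependent interventional costs. For all of the above settings, we provide the first known provable algorithms for efficiently computing (near-)optimal verifying sets on general graphs. For the second problem, we give a simple adaptive algorithm based on graph separators that produces an atomic intervention set that fully orients any essential graph while using $\mathcal{O}(\log n)$ times the optimal number of interventions needed to verify (the verifying size) the underlying DAG on $n$ vertices. This approximation is tight with respect to the verifying size, since any search algorithm has a worst-case approximation ratio of $\Omega(\log n)$. With bounded-size interventions, each of size $\leq k$, our algorithm gives an $\mathcal{O}(\log n \cdot \log \log \log k)$ factor approximation. Our result is the first known algorithm that gives a non-trivial approximation guarantee to the verifying size on general unweighted graphs and with bounded-size interventions.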
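The notion of covered edges mentioned in this abstract can be made concrete with a small sketch. It assumes the standard definition from the causal-discovery literature (an edge u -> v is covered when the parents of v are exactly the parents of u together with u itself); the abstract does not spell the definition out, and the function name and DAG encoding here are illustrative only.

```python
def covered_edges(parents):
    """Return all covered edges of a DAG.

    `parents` maps each node to the set of its parents. An edge u -> v is
    treated as covered when Pa(v) == Pa(u) | {u} (assumed standard definition).
    """
    covered = []
    for v, pa_v in parents.items():
        for u in pa_v:
            if set(pa_v) == set(parents.get(u, ())) | {u}:
                covered.append((u, v))
    return sorted(covered)


# Fully connected DAG on three nodes: a -> b, a -> c, b -> c.
dag = {"a": set(), "b": {"a"}, "c": {"a", "b"}}
print(covered_edges(dag))  # [('a', 'b'), ('b', 'c')]
```

In this example a -> c is not covered, because c has the extra parent b that a does not share.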

SSL-Lanes: Self-Supervised Learning for Motion Forecasting in Autonomous Driving

Prarthana Bhattacharyya , Chengjie Huang , Krzysztof Czarnecki

Categories: Computer Vision | Artificial Intelligence | Robotics

2022-06-28

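Self-supervised learning (SSL) is an emerging technique that has been successfully employed to train convolutional neural networks (CNNs) and graph neural networks (GNNs) for more transferable, generalizable, and robust representation learning. However, its potential in motion forecasting for autonomous driving has rarely been explored. In this study, we report the first systematic exploration and assessment of incorporating self-supervision into motion forecasting. We first propose to investigate four novel self-supervised learning tasks for motion forecasting, supported by theoretical rationale and by quantitative and qualitative comparisons on the challenging large-scale Argoverse dataset. Second, we point out that our auxiliary SSL-based learning setup not only outperforms forecasting methods that use transformers, complicated fusion mechanisms, and sophisticated online dense goal candidate optimization algorithms in terms of accuracy, but also has lower inference time and architectural complexity. Finally, we conduct several experiments to understand why SSL improves motion forecasting. Code is open-sourced at https://github.com/autovision-cloud/ssl-lanes.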

Knowledge Graph Construction and Its Application in Automatic Radiology Report Generation from Radiologist's Dictation

Kaveri Kale , Pushpak Bhattacharyya , Aditya Shetty , Milind Gune , Kush Shrivastava , Rustom Lawyer , Spriha Biswas

Categories: Natural Language Processing | Artificial Intelligence

2022-06-13

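Conventionally, the radiologist prepares the diagnosis notes and shares them with the transcriptionist. The transcriptionist then prepares a preliminary formatted report by referring to the notes, and finally the radiologist reviews the report, corrects the errors, and signs off. This workflow causes significant delays and errors in the report. In the current research work, we focus on applying NLP techniques such as Information Extraction (IE) and domain-specific Knowledge Graphs (KGs) to automatically generate radiology reports from the radiologist's dictation. This paper focuses on KG construction for each organ by extracting information from a large existing corpus of free-text radiology reports. We develop an information extraction pipeline that combines rule-based, pattern-based, and dictionary-based techniques with lexical-semantic features to extract entities and relations. Information missing from a short dictation can be retrieved from the KGs to generate pathological descriptions and hence the radiology report. The generated pathological descriptions were evaluated using semantic similarity metrics, which show 97% similarity with gold-standard pathological descriptions. In addition, our analysis shows that our IE module performs better than the OpenIE tool for the radiology domain. Furthermore, we include a manual qualitative analysis by radiologists, which shows that 80-85% of the generated reports are correctly written and the remainder are partially correct.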

PIDNet: A Real-time Semantic Segmentation Network Inspired from PID Controller

Jiacong Xu , Zixiang Xiong , Shankar P. Bhattacharyya

Categories: Computer Vision | Artificial Intelligence

2022-06-04

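Two-branch network architectures have shown their efficiency and effectiveness in real-time semantic segmentation tasks. However, direct fusion of low-level details and high-level semantics leads to a phenomenon in which the detailed features are easily overwhelmed by surrounding contextual information, referred to as overshoot in this paper, which limits the accuracy gains of existing two-branch models. In this paper, we draw a connection between Convolutional Neural Networks (CNNs) and Proportional-Integral-Derivative (PID) controllers and reveal that a two-branch network is nothing but a Proportional-Integral (PI) controller, which inherently suffers from a similar overshoot issue. To alleviate this issue, we propose a novel three-branch network architecture, PIDNet, which has three branches that parse detailed, context, and boundary information (the derivative of semantics), respectively, and employs boundary attention to guide the fusion of the detailed and context branches in the final stage. The PIDNet family achieves the best trade-off between inference speed and accuracy, and its test accuracy surpasses all existing models with similar inference speed on the Cityscapes, CamVid, and COCO-Stuff datasets. In particular, PIDNet-S achieves 78.6% mIoU at an inference speed of 93.2 FPS on the Cityscapes test set and 80.1% mIoU at 153.7 FPS on the CamVid test set.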
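The control-theory analogy behind this abstract can be illustrated with a toy simulation of a classical feedback loop. This is a sketch of PI versus PID control only, not of the segmentation network; the plant model (a unit mass with light damping), the gains, and the function name are all assumptions chosen for the example.

```python
def step_response_overshoot(kp, ki, kd, damping=0.2, dt=0.01, duration=200.0):
    """Simulate a unit-step setpoint for the toy plant x'' = u - damping * x'
    and return how far the output rises above the setpoint.

    The derivative term acts on the measured velocity (-kd * v), a common
    way to avoid a spike when the setpoint jumps.
    """
    x = v = integral = 0.0
    peak = 0.0
    for _ in range(int(duration / dt)):
        error = 1.0 - x
        integral += error * dt
        u = kp * error + ki * integral - kd * v
        acceleration = u - damping * v
        v += acceleration * dt   # semi-implicit Euler: update velocity first
        x += v * dt              # then position with the new velocity
        peak = max(peak, x)
    return max(0.0, peak - 1.0)


overshoot_pi = step_response_overshoot(kp=1.0, ki=0.05, kd=0.0)
overshoot_pid = step_response_overshoot(kp=1.0, ki=0.05, kd=1.8)
print(f"PI overshoot:  {overshoot_pi:.3f}")
print(f"PID overshoot: {overshoot_pid:.3f}")
```

With these hypothetical gains the PI loop is only lightly damped and swings well past the setpoint, while the added derivative term supplies damping and keeps the peak much closer to it.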
